Haplotype inference for present-absent genotype data using previously identified haplotypes and haplotype patterns

Authors

Yun Joo Yoo

Jianming Tang

Richard A. Kaslow

Kui Zhang

Abstract

MOTIVATION: Killer immunoglobulin-like receptor (KIR) genes vary considerably in their presence or absence on a specific regional haplotype. Because presence or absence of these genes is largely detected using locus-specific genotyping technology, the distinction between homozygosity and hemizygosity is often ambiguous. The performance of methods for haplotype inference (e.g. PL-EM, PHASE) for KIR genes may be compromised by the large proportion of ambiguous data. At the same time, many haplotypes or partial haplotype patterns have been previously identified and can be incorporated to facilitate haplotype inference for unphased genotype data. To accommodate the increased ambiguity of present-absent genotyping of KIR genes, we developed a hybrid approach combining a greedy algorithm with the Expectation-Maximization (EM) method for haplotype inference based on previously identified haplotypes and haplotype patterns.

RESULTS: We implemented this algorithm in a software package named HAPLO-IHP (Haplotype inference using identified haplotype patterns) and compared its performance with that of HAPLORE and PHASE on simulated KIR genotypes. We used five measures to evaluate the reliability of haplotype assignments and the accuracy of haplotype frequency estimates. Our method outperformed the two existing techniques on all five measures when either 60% or 25% of previously identified haplotypes were incorporated into the analyses.

AVAILABILITY: HAPLO-IHP is available at http://www.soph.uab.edu/Statgenetics/People/KZhang/HAPLO-IHP/index.html.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
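To make the homozygosity/hemizygosity ambiguity concrete, the following is a minimal Python sketch of an EM estimate of haplotype frequencies from presence/absence genotypes, restricted to a supplied list of known haplotypes. It is not the HAPLO-IHP implementation: the greedy step is omitted because the abstract does not describe it, and the function names (compatible_pairs, em_frequencies), the 0/1 encoding, and the Hardy-Weinberg pairing assumption are all illustrative choices made here.

```python
from itertools import combinations_with_replacement


def compatible_pairs(genotype, haplotypes):
    """Return index pairs (i, j), i <= j, of candidate haplotypes whose
    gene-wise OR reproduces the observed presence/absence genotype.
    A gene is observed as present (1) if either haplotype carries it."""
    pairs = []
    for i, j in combinations_with_replacement(range(len(haplotypes)), 2):
        combined = tuple(a | b for a, b in zip(haplotypes[i], haplotypes[j]))
        if combined == tuple(genotype):
            pairs.append((i, j))
    return pairs


def em_frequencies(genotypes, haplotypes, n_iter=200, tol=1e-10):
    """Estimate frequencies of the candidate haplotypes by EM,
    assuming random pairing of haplotypes (an assumption of this sketch)."""
    k = len(haplotypes)
    freqs = [1.0 / k] * k  # uniform starting point
    pair_lists = [compatible_pairs(g, haplotypes) for g in genotypes]
    if any(not pairs for pairs in pair_lists):
        raise ValueError("a genotype is not explained by any candidate pair")

    for _ in range(n_iter):
        counts = [0.0] * k
        # E-step: share each individual over its compatible pairs
        for pairs in pair_lists:
            weights = [freqs[i] * freqs[j] * (1 if i == j else 2)
                       for i, j in pairs]
            total = sum(weights)
            for (i, j), w in zip(pairs, weights):
                counts[i] += w / total
                counts[j] += w / total
        # M-step: expected haplotype counts become the new frequencies
        new_freqs = [c / (2 * len(genotypes)) for c in counts]
        delta = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# One gene, two candidate haplotypes: "present" could be two copies
# (pair 0/0) or one copy (pair 0/1), which is the ambiguity in question.
ambiguous = compatible_pairs((1,), [(1,), (0,)])

# Two genes, two known haplotypes, four individuals (toy data).
known = [(1, 0), (0, 1)]
observed = [(1, 0), (1, 0), (1, 1), (0, 1)]
estimated = em_frequencies(observed, known)
```

In the toy data every genotype has exactly one compatible pair, so the estimate is just the counted haplotype proportions; with real KIR data most genotypes would have several compatible pairs and the E-step weights would do the work.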

Similar resources

A hidden Markov model for haplotype inference for present-absent data of clustered genes using identified haplotypes and haplotype patterns

The majority of killer cell immunoglobulin-like receptor (KIR) genes are detected as either present or absent using locus-specific genotyping technology. Ambiguity arises from the presence of a specific KIR gene since the exact copy number (one or two) of that gene is unknown. Therefore, haplotype inference for these genes is becoming more challenging due to such a large portion of missing informat...

Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model

Single Nucleotide Polymorphisms (SNPs) are the most common form of polymorphism in the human genome. Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. The particular pattern of these common variations forms a block-like structure on the human genome. In this work, we develop a new method based on the Perfect Phylogeny Model to identify haplo...

Association of P53 (+16ins-Arg) Haplotype with the Increased Susceptibility to Breast Cancer in Iranian-Azeri Women

Background: Many case-control investigations have shown a correlation between TP53 gene polymorphisms and the risk of breast cancer. However, the findings are not consistent. It has been suggested that the investigation of P53 genotype combinations and haplotypes may be more helpful than the detection of single polymorphisms. In the present study, we investigate...

Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies.

Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs) is sufficient to capture most of the haplotype structure of the human genome. In this paper, we develop a method to partition haplotypes into blocks and to identify tag SNPs based on genotype data ...

Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data

MOTIVATION: Haplotypes, defined as the sequence of alleles on one chromosome, are crucial for many genetic analyses. As experimental determination of haplotypes is extremely expensive, haplotypes are traditionally inferred using computational approaches from genotype data, i.e. the mixture of the genetic information from both haplotypes. The best-performing approaches for haplotype inference rely on...

Journal:

Bioinformatics

Volume 23, Issue 18

Publication date: 2007
